
Implement native async client#617

Merged
joe-clickhouse merged 46 commits into main from
joe/141-a-database-client-should-be-based-on-asyncio
Mar 26, 2026

Conversation

@joe-clickhouse
Contributor

@joe-clickhouse joe-clickhouse commented Jan 15, 2026

Summary

Replaces the old executor-based AsyncClient, which wrapped the sync HttpClient in a ThreadPoolExecutor, with a native async implementation built on aiohttp. The public API surface is unchanged: clickhouse_connect.get_async_client() returns an AsyncClient with the same methods. The difference is entirely under the hood, where real async I/O replaces thread-pool delegation.
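A minimal usage sketch (host and port are placeholders, and the `close()` call is assumed to mirror the sync client's API):

```python
import asyncio

async def fetch_one(host: str = "localhost", port: int = 8123):
    # clickhouse_connect is imported inside the coroutine so this sketch can
    # be defined without the package installed; real code would import it at
    # module level.
    import clickhouse_connect
    client = await clickhouse_connect.get_async_client(host=host, port=port)
    try:
        result = await client.query("SELECT 1")
        return result.result_rows
    finally:
        await client.close()

# asyncio.run(fetch_one()) would run this against a live server.
```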

Why this change

The previous AsyncClient ran every operation in a thread pool via loop.run_in_executor(). This:

  • added thread overhead and context switching
  • limited the actual benefits of async I/O
  • complicated resource and session management

The new implementation performs HTTP I/O natively with aiohttp, giving real concurrency benefits for async workloads.

Design

Native async I/O

Requests use aiohttp.ClientSession with a configurable TCPConnector (pool limits, keepalive). HTTP response handling is fully async. aiohttp is an optional dependency installed via pip install clickhouse-connect[async].
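As a rough sketch of what the connector setup might look like (the parameter names below are aiohttp's own; the client's `connector_limit` / `connector_limit_per_host` options presumably map onto aiohttp's `limit` / `limit_per_host`):

```python
async def make_session(limit: int = 8, limit_per_host: int = 4):
    # aiohttp is an optional dependency, so it is imported lazily here.
    import aiohttp
    connector = aiohttp.TCPConnector(
        limit=limit,                    # total pooled connections
        limit_per_host=limit_per_host,  # per-host cap
        keepalive_timeout=30.0,         # seconds to keep idle sockets alive
    )
    return aiohttp.ClientSession(connector=connector)
```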

Streaming bridge for ClickHouse Native format

Native format parsing and serialization is synchronous CPU-bound work. The client uses a bounded queue in AsyncSyncQueue as a sync/async bridge so async network reads/writes can overlap with sync parsing/serialization in an executor.

On the query path in StreamingResponseSource, the async producer reads from the aiohttp response and the sync consumer parses in an executor. On the insert path in StreamingInsertSource, the sync producer serializes in an executor and the async consumer streams to aiohttp.
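AsyncSyncQueue's internals aren't shown here, but the query-path idea can be sketched with a bounded `queue.Queue` bridging an async producer and a sync consumer running in the default executor (names and chunk handling are illustrative, not the library's actual code):

```python
import asyncio
import queue

_SENTINEL = object()

async def pipeline(chunks, parse):
    """Overlap async 'network reads' with sync parsing in an executor.

    The bounded queue provides backpressure: the producer blocks (off-loop)
    when the parser falls behind.
    """
    loop = asyncio.get_running_loop()
    q: queue.Queue = queue.Queue(maxsize=4)

    def consume():
        # Sync consumer: runs in a worker thread, parsing chunks as they arrive.
        out = []
        while True:
            item = q.get()
            if item is _SENTINEL:
                return out
            out.append(parse(item))

    consumer = loop.run_in_executor(None, consume)
    for chunk in chunks:                # stands in for reading the aiohttp response
        await asyncio.sleep(0)          # yield to the loop, as a real read would
        await loop.run_in_executor(None, q.put, chunk)  # blocking put, off-loop
    await loop.run_in_executor(None, q.put, _SENTINEL)
    return await consumer

# asyncio.run(pipeline([b"a", b"b"], bytes.decode)) -> ["a", "b"]
```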

Event loop safety

Non-streaming query methods such as .query() and .query_df() are fully materialized inside the executor before returning. By the time a QueryResult is returned, all data is in memory, so synchronous iteration won't block the loop.

Streaming queries like .query_rows_stream(), .query_df_stream(), etc. detect synchronous iteration from within an async context and raise ProgrammingError immediately, prompting the caller to use async for instead.
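A usage sketch of the streaming path (the async context-manager shape here is assumed from the sync client's query_rows_stream API):

```python
async def collect_rows(client, sql: str):
    # Iterating the stream synchronously from async code raises
    # ProgrammingError; `async for` is required instead.
    rows = []
    async with client.query_rows_stream(sql) as stream:
        async for row in stream:
            rows.append(row)
    return rows
```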

Lazy imports

aiohttp is imported lazily so import clickhouse_connect works without it installed. Attempting to use the async client without aiohttp raises a clear ImportError with install instructions. Heavy optional dependencies (numpy, pandas, pyarrow, polars) are also lazily loaded, matching the sync client.

Breaking changes

  • AsyncClient(client=sync_client) no longer works. Use get_async_client() or create_async_client().
  • The executor_threads and executor parameters have been removed from create_async_client().
  • pool_mgr is rejected on the async path with a message pointing to connector_limit / connector_limit_per_host.
  • The internal module clickhouse_connect.driver.aiohttp_client no longer exists. AsyncClient is importable from clickhouse_connect.driver as before.

Migration

# Before (no longer works):
sync_client = clickhouse_connect.get_client()
async_client = AsyncClient(client=sync_client)

# After:
async_client = await clickhouse_connect.get_async_client(host="...", port=8123)

The async client API is otherwise identical. All query, insert, and streaming methods have the same signatures.

Tests

The full integration test suite runs parametrized across both sync and async clients. Dedicated async tests in test_async_features.py cover concurrency, streaming cleanup, session protection, timeouts, and error isolation.

Performance

Benchmarks comparing the old executor-based client against the native async client showed speedups ranging from parity to 75% depending on workload, with a geometric mean improvement of around 16% across a wide range of realistic workloads. P95 latencies also improved significantly.

Trade-offs

  • ClickHouse Native format parsing and serialization is CPU-bound and still runs in a thread pool executor. The async benefit comes from I/O concurrency, i.e. overlapping network reads/writes with parsing, not from making the parsing itself async.
  • Non-streaming query results, e.g. from .query() and .query_df(), are fully materialized in the executor before returning to the caller, which is the expected behavior for those APIs. Streaming variants such as .query_rows_stream() are available for incremental processing.

Checklist

  • Unit and integration tests covering the common scenarios were added
  • A human-readable description of the changes was provided to include in CHANGELOG

@joe-clickhouse joe-clickhouse linked an issue Jan 15, 2026 that may be closed by this pull request
@joe-clickhouse joe-clickhouse changed the title Joe/141 a database client should be based on asyncio Implement native async client Jan 16, 2026
Collaborator

@genzgd genzgd left a comment


This seems okay to me, although I can't claim to have done anything resembling a full review. A couple observations:

  • I'm curious as to where the improvements come from over the existing implementation, so I'm looking forward to that blog post.
  • There's a lot of duplicated code in the aiohttp_client. It would be nice to consolidate that somewhere.
  • The piece with the async queue is hard to follow -- I don't know how feasible it is, but it would be nice to remove that layer and just use some kind of async-based generator without wrapping the extra queue.

@joe-clickhouse
Contributor Author

Thanks @genzgd. To address your questions:

  • I think the improvements are mainly from true I/O <-> CPU pipelining. In the existing async client we run the sync client in an executor, and it effectively does read -> parse -> read sequentially in a single thread. In the new client, an async producer reads from aiohttp and pushes chunks into AsyncSyncQueue while the parser runs in a separate executor thread. Those stages actually overlap, giving true concurrency.
  • Agreed on the duplication. I avoided refactoring the shared sync client pieces for now to keep the changes fully separate while the async path is still new. Once it stabilizes, I can move common logic into the base client to reduce duplication.
  • I did try for quite a while to use simpler async‑generator patterns, but we need to keep using the synchronous NativeTransform parser. If we do parsing directly on the event loop, we lose the async benefit because the CPU‑heavy parsing blocks the loop. The queue lets the event loop keep reading from the socket while parsing runs off‑loop. Additionally, it provides backpressure/bounded buffering.

@genzgd
Collaborator

genzgd commented Jan 30, 2026

In the new client, an async producer reads from aiohttp and pushes chunks into AsyncSyncQueue while the parser runs in a separate executor thread. Those stages actually overlap, giving true concurrency.

If we do parsing directly on the event loop, we lose the async benefit because the CPU‑heavy parsing blocks the loop. The queue lets the event loop keep reading from the socket while parsing runs off‑loop. Additionally, it provides backpressure/bounded buffering.

Yes, as I think about it, that makes sense. It might be theoretically possible to run the sync HTTP client (and the buffer) in a separate thread from the parser, gaining a similar benefit. On a related note, making the transform step truly parallel would be challenging given that HTTP chunks won't align with Native format blocks, but that's another argument in favor of a TCP protocol client. :)

@joe-clickhouse
Contributor Author

For those interested, I have published a RC off this branch for testing and feedback: https://github.com/ClickHouse/clickhouse-connect/releases/tag/v0.12.0rc1

@nathan-gage

For those interested, I have published a RC off this branch for testing and feedback: https://github.com/ClickHouse/clickhouse-connect/releases/tag/v0.12.0rc1

this is so great! we will be trying this in our staging environment and report back.

@kbumsik

kbumsik commented Feb 19, 2026

Hi, I have been testing v0.12rc and I got an interesting improvement with Opentelemetry Context propagation for Async Client. This is somewhat related to #303 :

  • Before: when tracing urllib3, a new root span was created for each query() (which is bad) because urllib3 runs in a separate thread executor, and the otel context is not automatically propagated to the executor.
  • After: Tracing aiohttp grabs a proper otel context automatically.

@haydn-jones

Has largely been working well for me. I've noticed some intermittent server disconnect issues, but I suspect I'm the cause of these somehow.

@joe-clickhouse joe-clickhouse added the hold for 1.0.0 hold off merging until we're ready for 1.0.0 label Mar 25, 2026
@joe-clickhouse joe-clickhouse merged commit 2956318 into main Mar 26, 2026
37 checks passed
@thewhaleking

Is this included in 0.15.0?

@joe-clickhouse
Contributor Author

Hi @thewhaleking, no, I cut 0.15.0 as kinda like the last release before the official roll to 1.0.0. I'll have a 1.0.0rc1 out sometime in the next few weeks that will include this. I did release 0.12.0rc1 a while back, which you can grab from PyPI if you want to try this out, though.



Development

Successfully merging this pull request may close these issues.

A database client should be based on asyncio
